Add an option to not generate precise GC info #75817

MichalStrehovsky · 2022-09-19T00:26:03Z

Follow up to #75803.

If enabled, conservative GC stack scanning will be used and metadata related to GC stack reporting will not be generated. The generated executable file will be smaller, but the GC will be less efficient (garbage collection might take longer and keep objects alive for longer periods of time than usual).

Saves 4.4% in size on a Hello World. I'll take that.

Cc @dotnet/ilc-contrib


EgorBo · 2022-09-19T01:06:16Z

Some libraries tests are guarded with IsPreciseGcSupported which is currently implemented as "Not mono" - perhaps, worth adding this mode here for NativeAOT?

VSadov · 2022-09-19T01:06:43Z

Interesting. This could be a useful option for platforms that do not fully support stack walking.

> Saves 4.4% in size on a Hello World

I am curious about the % savings for larger apps. Assuming that there is a fixed native runtime size, larger apps may benefit a bit more.
Is it easy to measure the diff for ilc?

MichalStrehovsky · 2022-09-19T01:17:42Z

> Some libraries tests are guarded with IsPreciseGcSupported which is currently implemented as "Not mono" - perhaps, worth adding this mode here for NativeAOT?

CoreCLR is also capable of running in this mode. It might be worth it if we're adding official testing for this. It's not what I'm doing right now. (I don't know if there's a good way to probe for this.)

> I am curious about the % savings for larger apps. Assuming that there is a fixed native runtime size, larger apps may benefit a bit more.

It will be once this merges - about 15% of hello world (500 kB of 3.6 MB) is native code that isn't affected by this. This percentage will be smaller for larger apps. But 15% is already a pretty small number.

VSadov · 2022-09-19T01:21:32Z

> I don't know if there's a good way to probe for this.

For testing purposes, since you want to enable this and do a test pass, you could just set it to true temporarily to reduce noise.

VSadov · 2022-09-19T01:23:19Z

It is not a lot of noise though typically. False positives when doing conservative stack scans are relatively rare.

EgorBo · 2022-09-19T01:31:19Z

> It is not a lot of noise though typically. False positives when doing conservative stack scans are relatively rare.

It was pretty annoying when Mono was wired up to run libs tests 🙂

VSadov · 2022-09-19T01:42:31Z

> It was pretty annoying when Mono was wired up to run libs tests

It does not need to be frequent for a failure to become annoying. :-)
And I guess they were a few different tests every time.

Statistically, though, stacks are shallow, and in a big(ish) program most objects would be rooted by statics and/or a few long-lived stack roots that hold large portions of the app context.

Conservative scanning is a bigger problem when both stack and heap are conservative. When it is just stacks, there are fewer opportunities.

MichalStrehovsky · 2022-09-19T02:01:27Z

exit code 139 means SIGSEGV Illegal memory access

Cool. I was surprised we can just not generate it and GC.Collect works. It's probably only GC.Collect that works.

Will probably have to leave a breadcrumb somewhere for the runtime not to expect it.

VSadov · 2022-09-19T03:25:27Z

Do we need GC info for exception handling?

jkotas · 2022-09-19T05:34:57Z

> garbage collection might take longer and keep objects alive for longer periods of time than usual

We have done some measurements around this some years back. The average perf hit for real world server (compute bound) apps is about 20% rps.

> CoreCLR is also capable of running in this mode.

CoreCLR is not reliable (i.e. it will crash intermittently) with conservative stack scanning. Dynamic methods and collectible assemblies are not able to handle the situation where an unreachable object becomes reachable again.

VSadov · 2022-09-19T06:08:39Z

It would be interesting to do a similar comparison with NativeAOT if we have a good benchmark.

Compared with CoreCLR there are two differences:

1. On CoreCLR, enabling conservative mode disables asynchronous GC suspension; it becomes polling-only (the stack walking machinery starts reporting "I know nothing, this may not be jitted code"). That can make suspension pauses longer, during which neither the app nor the GC is working, and so affects throughput.
   In NativeAOT, conservative stack reporting still supports async suspension in "everything managed is a safepoint" mode. That may actually lead to faster suspensions than in the regular case.
2. On CoreCLR, enabling conservative mode enables both conservative stack reporting and support for that in the GC.
   In NativeAOT it is only the first part. The GC support is turned on unconditionally, so the diff from enabling conservative mode is smaller.

It is hard to tell how much of an effect these differences have. The second is probably insignificant, but there is a chance the first has a measurable impact.

The place where CoreCLR turns off async suspension:

runtime/src/coreclr/vm/threadsuspend.cpp

Line 4852 in 774324e

    
           // Conservative GC enabled; behave as if HIJACK_NONINTERRUPTIBLE_THREADS had not been

VSadov · 2022-09-19T06:15:15Z

Actually USE_GC_INFO_DECODER seems to be off only on x86, so perhaps there is no difference between CoreCLR and NativeAOT.

MichalStrehovsky · 2022-09-19T08:21:25Z

> Do we need GC info for exception handling?

The crash is happening here:

runtime/src/coreclr/nativeaot/Runtime/windows/CoffNativeCodeManager.cpp

Lines 561 to 574 in 16cb35f

    
           GcInfoDecoder decoder(GCInfoToken(p), DECODE_REVERSE_PINVOKE_VAR);
           INT32 slot = decoder.GetReversePInvokeFrameStackSlot();
           assert(slot != NO_REVERSE_PINVOKE_FRAME);

           TADDR basePointer = NULL;
           UINT32 stackBasedRegister = decoder.GetStackBaseRegister();
           if (stackBasedRegister == NO_STACK_BASE_REGISTER)
           {
               basePointer = dac_cast<TADDR>(pRegisterSet->GetSP());
           }
           else
           {
               basePointer = dac_cast<TADDR>(pRegisterSet->GetFP());
           }

So we need GC info to unwind reverse P/Invokes. Looks like we would need to generate some sort of minimal GC info to allow for that. (There might be more; this is just the crash that I looked at.)

I'm going to close this. It was only worth doing if it was reasonably cheap, since it would likely stay an obscure undocumented switch anyway.

AndyAyersMS · 2022-09-19T16:49:01Z

Yeah, GC info conveys some non-GC info, so you can't skip emitting it entirely.

MichalStrehovsky added 2 commits September 19, 2022 09:22

Temporarily enable for all tests

ca825f8

dotnet-issue-labeler bot added the area-NativeAOT-coreclr label Sep 19, 2022

ghost assigned MichalStrehovsky Sep 19, 2022

Update optimizing.md

83b13bc

Update Microsoft.NETCore.Native.targets

246de00

MichalStrehovsky closed this Sep 19, 2022

ghost locked as resolved and limited conversation to collaborators Oct 19, 2022
